Noise resistant audio-visual verification via structural constraints
Abstract
In this paper we propose a piece-wise linear classifier for use as the decision stage in a two-modal verification system composed of a face expert and a speech expert. The classifier uses a fixed decision boundary specifically designed to account for the effects of noisy audio conditions. Experimental results show that in clean conditions the proposed classifier is outperformed by a traditional weighted summation decision stage (using either fixed or adaptive weights); however, in high noise conditions the classifier outperforms the fixed-weight approach and performs similarly to the adaptive approach, with the advantage of having a fixed (non-adaptive) structure.
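The contrast between the two decision stages can be illustrated with a minimal sketch. The abstract does not give the shape of the boundary, the weights, or the score ranges, so everything numeric here is an assumption made for illustration: the function names, the `knee` point, both slopes, and the threshold are hypothetical and are not the values used in the paper.

```python
def weighted_sum_accept(face, speech, w_face=0.5, threshold=0.0):
    """Baseline decision stage: fuse the two expert opinions with fixed weights.

    The weight and threshold are illustrative assumptions.
    """
    fused = w_face * face + (1.0 - w_face) * speech
    return fused >= threshold


def piecewise_linear_accept(face, speech, knee=(0.5, -0.5),
                            slope_left=-1.0, slope_right=-4.0):
    """Fixed piece-wise linear boundary in the (face, speech) opinion plane.

    The boundary consists of two line segments that meet at `knee`.
    A claim is accepted when the speech opinion lies on or above the
    boundary at the given face opinion. The knee and slopes are
    hypothetical values, not the boundary designed in the paper.
    """
    knee_face, knee_speech = knee
    slope = slope_left if face < knee_face else slope_right
    boundary = knee_speech + slope * (face - knee_face)
    return speech >= boundary


# A claim with a strong face opinion but a speech opinion degraded by noise.
noisy_true_claim = (1.0, -2.0)
# A claim where both experts give low opinions.
impostor_claim = (-1.0, -1.0)

print(weighted_sum_accept(*noisy_true_claim))      # fixed weights
print(piecewise_linear_accept(*noisy_true_claim))  # fixed PL boundary
print(piecewise_linear_accept(*impostor_claim))
```

With these made-up numbers, the steep right-hand segment lets a strong face opinion compensate for a noise-degraded speech opinion without adapting any weights at test time, which is the kind of structural constraint the abstract describes.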
Similar resources
Using lip features for multimodal speaker verification
With the prevalence of the information age, privacy and personalization are forefront in today's society. As such, biometrics is viewed as an essential component of current and evolving technological systems. Consumers demand unobtrusive and non-invasive approaches. In our previous work, we have demonstrated a speaker verification system that meets these criteria. However, there are additional ...
Robust Audio-Visual Person Verification Using Web-Camera Video
This thesis examines the challenge of robust audio-visual person verification using data recorded in multiple environments with various lighting conditions, irregular visual backgrounds, and diverse background noise. Audio-visual person verification could prove to be very useful in both physical and logical access control security applications, but only if it can perform well in a variety of en...
Labeling audio-visual speech corpora and training an ANN/HMM audio-visual speech recognition system
We present a method to label an audio-visual database and to set up a system for audio-visual speech recognition based on a hybrid Artificial Neural Network/Hidden Markov Model (ANN/HMM) approach. The multi-stage labeling process is presented on a new audiovisual database recorded at the Institute de la Communication Parlée (ICP). The database was generated via transposition of the audio databas...
Human-Robot Interaction in Real Environments by Audio-Visual Integration
In this paper, we developed not only a reliable sound localization system including a VAD (Voice Activity Detection) component using three microphones but also a face tracking system using a vision camera. Moreover, we proposed a way to integrate three systems in the human-robot interaction to compensate errors in the localization of a speaker and to reject unnecessary speech or noise signals e...
Structurally noise resistant classifier for multi-modal person verification
In this letter we propose a piece-wise linear (PL) classifier for use as the decision stage in a two-modal verification system, comprised of a face and a speech expert. The classifier utilizes a fixed decision boundary that has been specifically designed to account for the effects of noisy audio conditions. Experimental results on the VidTIMIT database show that in clean conditions, the propose...
Publication date: 2003